Word Spotting in Cursive Handwritten Documents Using Modified Character Shape Codes
نویسنده
چکیده
There is a large collection of Handwritten English paper documents of Historical and Scientific importance. But paper documents are not recognised directly by computer. Hence the closest way of indexing these documents is by storing their document digital image. Hence a large database of document images can replace the paper documents. But the document and data corresponding to each image cannot be directly recognised by the computer. This paper applies the technique of word spotting using Modified Character Shape Code to Handwritten English document images for quick and efficient query search of words on a database of document images. It is different from other Word Spotting techniques as it implements two level of selection for word segments to match search query. First based on word size and then based on character shape code of query. It makes the process faster and more efficient and reduces the need of multiple pre-processing.
منابع مشابه
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting is to make searchable unindexed image documents by locating word/words in a doc-ument image, given a query word. This problem is challenging, mainly due to the large numberof word classes with very small inter-class and substantial intra-class distances. In this paper, asegmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
متن کاملKeyword Spotting from Online Chinese Handwritten Documents using One-versus-All Character Classification Model
In this paper, we propose a method for text-query-based keyword spotting from online Chinese handwritten documents using character classi ̄cation model. The similarity between the query word and handwriting is obtained by combining the character classi ̄cation scores. The classi ̄er is trained by one-versus-all strategy so that it gives high similarity to the target class and low scores to the oth...
متن کاملA Dynamic Programming Method for Segmentation of Online Cursive Uyghur Handwritten Words into Basic Recognizable Units
Correct and efficient segmentation of Uyghur words into characters is crucial to the successful recognition. However, little work has been done in this area. There are many connected characters in cursive Uyghur handwriting, which makes the segmentation and recognition of Uyghur words very difficult. To enable large vocabulary Uyghur word recognition using character models, we propose a charact...
متن کاملKeyword spotting in unconstrained handwritten Chinese documents using contextual word model
a r t i c l e i n f o Keywords: Keyword spotting Chinese handwritten documents Word similarity Contextual word model This paper proposes a method for keyword spotting in off-line Chinese handwritten documents using a contextual word model, which measures the similarity between the query word and every candidate word in the document by combining a character classifier and the geometric context a...
متن کاملAn Effective Character Separation Method for Online Cursive Uyghur Handwriting
There are many connected characters in cursive Uyghur handwriting, which makes the segmentation and recognition of Uyghur words very difficult. To enable large vocabulary Uyghur word recognition using character models, we propose a character separation method for over-segmentation in online cursive Uyghur handwriting. After removing delayed strokes from the handwritten words, potential breakpoi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2012